Overview
Dataset statistics
| Number of variables | 34 |
|---|---|
| Number of observations | 19372 |
| Missing cells | 371611 |
| Missing cells (%) | 56.4% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 25.8 MiB |
| Average record size in memory | 1.4 KiB |
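The missing-cell percentage in the table above can be recomputed from the other figures as a quick sanity check. This is a minimal sketch; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 34
n_observations = 19372
missing_cells = 371611

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

The result agrees with the reported 56.4%, so more than half of the cells in this table are empty.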
Variable types
| Text | 7 |
|---|---|
| Numeric | 3 |
| Categorical | 16 |
| Boolean | 6 |
| Unsupported | 2 |
Alerts
| Alert | Type |
|---|---|
| WHOLE_GENOME_SCREEN has constant value "False" | Constant |
| REARRANGEMENT_SCREEN has constant value "False" | Constant |
| MSI has constant value "Unknown" | Constant |
| CYTOGENETICS is highly overall correlated with GENDER and 5 other fields | High correlation |
| ENVIRONMENTAL_VARIABLES is highly overall correlated with INDIVIDUAL_ID and 5 other fields | High correlation |
| ETHNICITY is highly overall correlated with INDIVIDUAL_ID and 9 other fields | High correlation |
| FAMILY is highly overall correlated with GERMLINE_MUTATION and 9 other fields | High correlation |
| GENDER is highly overall correlated with CYTOGENETICS and 3 other fields | High correlation |
| GERMLINE_MUTATION is highly overall correlated with FAMILY and 7 other fields | High correlation |
| GRADE is highly overall correlated with FAMILY and 9 other fields | High correlation |
| INDIVIDUAL_ID is highly overall correlated with CYTOGENETICS and 12 other fields | High correlation |
| INDIVIDUAL_REMARK is highly overall correlated with GENDER and 11 other fields | High correlation |
| METASTATIC_SITE is highly overall correlated with CYTOGENETICS and 6 other fields | High correlation |
| MUTATION_ALLELE_SPECIFICATION is highly overall correlated with ETHNICITY and 9 other fields | High correlation |
| NORMAL_TISSUE_TESTED is highly overall correlated with ETHNICITY and 7 other fields | High correlation |
| RNASEQ_SCREEN is highly overall correlated with CYTOGENETICS and 11 other fields | High correlation |
| SAMPLE_TYPE is highly overall correlated with CYTOGENETICS and 5 other fields | High correlation |
| STAGE is highly overall correlated with GENDER and 8 other fields | High correlation |
| TARGETED_SCREEN is highly overall correlated with ENVIRONMENTAL_VARIABLES and 10 other fields | High correlation |
| THERAPY is highly overall correlated with ETHNICITY and 9 other fields | High correlation |
| TUMOUR_ID is highly overall correlated with CYTOGENETICS and 12 other fields | High correlation |
| TUMOUR_REMARK is highly overall correlated with FAMILY and 10 other fields | High correlation |
| TUMOUR_SOURCE is highly overall correlated with ENVIRONMENTAL_VARIABLES and 8 other fields | High correlation |
| WHOLE_EXOME_SCREEN is highly overall correlated with ENVIRONMENTAL_VARIABLES and 10 other fields | High correlation |
| SAMPLE_TYPE is highly imbalanced (53.6%) | Imbalance |
| WHOLE_EXOME_SCREEN is highly imbalanced (97.1%) | Imbalance |
| TARGETED_SCREEN is highly imbalanced (96.8%) | Imbalance |
| RNASEQ_SCREEN is highly imbalanced (99.5%) | Imbalance |
| GRADE is highly imbalanced (89.4%) | Imbalance |
| STAGE is highly imbalanced (73.8%) | Imbalance |
| THERAPY is highly imbalanced (74.8%) | Imbalance |
| FAMILY is highly imbalanced (71.3%) | Imbalance |
| NORMAL_TISSUE_TESTED has 17683 (91.3%) missing values | Missing |
| AGE has 15740 (81.3%) missing values | Missing |
| THERAPY_RELATIONSHIP has 16397 (84.6%) missing values | Missing |
| SAMPLE_DIFFERENTIATOR has 19303 (99.6%) missing values | Missing |
| MUTATION_ALLELE_SPECIFICATION has 19266 (99.5%) missing values | Missing |
| AVERAGE_PLOIDY has 19372 (100.0%) missing values | Missing |
| SAMPLE_REMARK has 19372 (100.0%) missing values | Missing |
| DRUG_RESPONSE has 17537 (90.5%) missing values | Missing |
| GRADE has 19231 (99.3%) missing values | Missing |
| AGE_AT_TUMOUR_RECURRENCE has 19335 (99.8%) missing values | Missing |
| STAGE has 19097 (98.6%) missing values | Missing |
| CYTOGENETICS has 19330 (99.8%) missing values | Missing |
| METASTATIC_SITE has 18912 (97.6%) missing values | Missing |
| TUMOUR_REMARK has 18428 (95.1%) missing values | Missing |
| ETHNICITY has 17527 (90.5%) missing values | Missing |
| ENVIRONMENTAL_VARIABLES has 19284 (99.5%) missing values | Missing |
| GERMLINE_MUTATION has 19356 (99.9%) missing values | Missing |
| THERAPY has 18066 (93.3%) missing values | Missing |
| FAMILY has 19240 (99.3%) missing values | Missing |
| INDIVIDUAL_REMARK has 19135 (98.8%) missing values | Missing |
| COSMIC_SAMPLE_ID has unique values | Unique |
| AVERAGE_PLOIDY is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| SAMPLE_REMARK is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
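The Imbalance alerts above report one score per column. The scores for the three Boolean screen columns are consistent with one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition (the report does not state it, and the helper name `imbalance_score` is hypothetical) and uses the True/False counts listed in the Variables section.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits
    of the class shares and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen for this sketch: one class, no score
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts copied from the Boolean variables in this report.
whole_exome = imbalance_score([19316, 56])
targeted = imbalance_score([19309, 63])
rnaseq = imbalance_score([19365, 7])

print(f"{whole_exome:.1%} {targeted:.1%} {rnaseq:.1%}")
```

A score near 100% means almost every row falls in a single class, which is why these columns carry little information for modelling.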
Reproduction
| Analysis started | 2025-07-15 00:50:38.700461 |
|---|---|
| Analysis finished | 2025-07-15 00:50:43.836070 |
| Duration | 5.14 seconds |
| Software version | ydata-profiling v4.16.1 |
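The reported duration follows directly from the two timestamps above. A minimal check using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2025-07-15 00:50:38.700461")
finished = datetime.fromisoformat("2025-07-15 00:50:43.836070")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```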
Variables
COSMIC_SAMPLE_ID
Text
Unique 
| Distinct | 19372 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.4 MiB |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 10.84617 |
| Min length | 10 |
Unique
| Unique | 19372 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | COSS1496935 |
|---|---|
| 2nd row | COSS1036427 |
| 3rd row | COSS1496985 |
| 4th row | COSS2468028 |
| 5th row | COSS820950 |
| Value | Count | Frequency (%) |
| coss1930149 | 1 | < 0.1% |
| coss1182421 | 1 | < 0.1% |
| coss1496935 | 1 | < 0.1% |
| coss1036427 | 1 | < 0.1% |
| coss1496985 | 1 | < 0.1% |
| coss2468028 | 1 | < 0.1% |
| coss820950 | 1 | < 0.1% |
| coss1731941 | 1 | < 0.1% |
| coss1384396 | 1 | < 0.1% |
| coss1544835 | 1 | < 0.1% |
| Other values (19362) | 19362 | 99.9% |
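The Frequency (%) column in tables like the one above is the row count divided by the total, with very small shares shown as "< 0.1%". The sketch below is one way to reproduce that display. The helper name `format_share` and the exact rounding thresholds are assumptions chosen to match the entries visible in this report.

```python
def format_share(count, total):
    """Format count/total as a percentage with one decimal place,
    collapsing shares that would round to 0.0% or 100.0%."""
    share = count / total
    if share > 0 and round(share, 3) == 0:
        return "< 0.1%"
    if share < 1 and round(share, 3) == 1:
        return "> 99.9%"
    return f"{share:.1%}"

n_rows = 19372
print(format_share(1, n_rows))      # a value that occurs once
print(format_share(125, n_rows))    # the most common AGE value
print(format_share(19362, n_rows))  # the "Other values" row above
```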
Most occurring characters
| Value | Count | Frequency (%) |
| S | 38744 | 18.4% |
| 1 | 22767 | 10.8% |
| 2 | 20382 | 9.7% |
| O | 19372 | 9.2% |
| C | 19372 | 9.2% |
| 9 | 14989 | 7.1% |
| 0 | 13345 | 6.4% |
| 3 | 11364 | 5.4% |
| 8 | 10954 | 5.2% |
| 4 | 10309 | 4.9% |
| Other values (3) | 28514 | 13.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 210112 |
SAMPLE_NAME
Text
| Distinct | 19269 |
|---|---|
| Distinct (%) | 99.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.3 MiB |
Length
| Max length | 17 |
|---|---|
| Median length | 7 |
| Mean length | 6.8540161 |
| Min length | 1 |
Unique
| Unique | 19221 |
|---|---|
| Unique (%) | 99.2% |
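SAMPLE_NAME has 19269 distinct values but only 19221 unique ones, so 48 names occur more than once. The sketch below illustrates the difference between the two counts, assuming "distinct" means the number of different values and "unique" means values that occur exactly once. The sample names in it are hypothetical, not taken from the dataset.

```python
from collections import Counter

# Hypothetical sample names for illustration only.
names = ["10", "10", "8", "E17588", "1496935", "8", "10"]

counts = Counter(names)
n_distinct = len(counts)                              # different values present
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_distinct, n_unique)
```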
Sample
| 1st row | 1496935 |
|---|---|
| 2nd row | 1036427 |
| 3rd row | 1496985 |
| 4th row | 2468028 |
| 5th row | E17588 |
| Value | Count | Frequency (%) |
| 10 | 6 | < 0.1% |
| 8 | 5 | < 0.1% |
| 9 | 5 | < 0.1% |
| 5 | 5 | < 0.1% |
| 6 | 5 | < 0.1% |
| 1 | 5 | < 0.1% |
| 7 | 5 | < 0.1% |
| 4 | 5 | < 0.1% |
| 11 | 4 | < 0.1% |
| 2 | 4 | < 0.1% |
| Other values (19257) | 19323 | 99.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 22613 | 17.0% |
| 2 | 19059 | 14.4% |
| 9 | 14888 | 11.2% |
| 0 | 13317 | 10.0% |
| 3 | 11998 | 9.0% |
| 4 | 10208 | 7.7% |
| 8 | 9627 | 7.3% |
| 6 | 9472 | 7.1% |
| 5 | 9146 | 6.9% |
| 7 | 8939 | 6.7% |
| Other values (30) | 3509 | 2.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 132776 |
| Distinct | 94 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.4 MiB |
Length
| Max length | 13 |
|---|---|
| Median length | 12 |
| Mean length | 12.002426 |
| Min length | 12 |
Unique
| Unique | 20 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | COSO31355381 |
|---|---|
| 2nd row | COSO28815381 |
| 3rd row | COSO31355381 |
| 4th row | COSO36075381 |
| 5th row | COSO28775381 |
| Value | Count | Frequency (%) |
| coso36605381 | 7704 | 39.8% |
| coso31355381 | 5178 | 26.7% |
| coso36075381 | 2672 | 13.8% |
| coso28815381 | 920 | 4.7% |
| coso36075385 | 553 | 2.9% |
| coso28815385 | 479 | 2.5% |
| coso28775381 | 367 | 1.9% |
| coso36075546 | 251 | 1.3% |
| coso36075763 | 209 | 1.1% |
| coso37735381 | 187 | 1.0% |
| Other values (84) | 852 | 4.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 41806 | 18.0% |
| O | 38744 | 16.7% |
| 5 | 26554 | 11.4% |
| 1 | 24185 | 10.4% |
| 8 | 22299 | 9.6% |
| 6 | 20180 | 8.7% |
| C | 19372 | 8.3% |
| S | 19372 | 8.3% |
| 0 | 11509 | 4.9% |
| 7 | 5708 | 2.5% |
| Other values (3) | 2782 | 1.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 232511 |
TUMOUR_ID
Real number (ℝ)
High correlation 
| Distinct | 19229 |
|---|---|
| Distinct (%) | 99.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1703638.8 |
| Minimum | 639922 |
| Maximum | 2826849 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 302.7 KiB |
Quantile statistics
| Minimum | 639922 |
|---|---|
| 5-th percentile | 740420.55 |
| Q1 | 1126645.8 |
| median | 1815975.5 |
| Q3 | 2294631.2 |
| 95-th percentile | 2760479.5 |
| Maximum | 2826849 |
| Range | 2186927 |
| Interquartile range (IQR) | 1167985.5 |
Descriptive statistics
| Standard deviation | 628388.95 |
|---|---|
| Coefficient of variation (CV) | 0.36885105 |
| Kurtosis | -1.1446555 |
| Mean | 1703638.8 |
| Median Absolute Deviation (MAD) | 529103 |
| Skewness | 0.0022162929 |
| Sum | 3.3002891 × 10¹⁰ |
| Variance | 3.9487268 × 10¹¹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1064091 | 7 | < 0.1% |
| 1079647 | 7 | < 0.1% |
| 1079643 | 7 | < 0.1% |
| 1079619 | 7 | < 0.1% |
| 981099 | 6 | < 0.1% |
| 1079633 | 6 | < 0.1% |
| 1079637 | 6 | < 0.1% |
| 1079627 | 5 | < 0.1% |
| 1079644 | 5 | < 0.1% |
| 1079621 | 5 | < 0.1% |
| Other values (19219) | 19311 | 99.7% |
| Value | Count | Frequency (%) |
| 639922 | 1 | < 0.1% |
| 683145 | 1 | < 0.1% |
| 683146 | 1 | < 0.1% |
| 683147 | 1 | < 0.1% |
| 683148 | 1 | < 0.1% |
| 683149 | 1 | < 0.1% |
| 683150 | 1 | < 0.1% |
| 683151 | 1 | < 0.1% |
| 683152 | 1 | < 0.1% |
| 683153 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 2826849 | 1 | < 0.1% |
| 2826848 | 1 | < 0.1% |
| 2826847 | 1 | < 0.1% |
| 2811858 | 1 | < 0.1% |
| 2811857 | 1 | < 0.1% |
| 2808140 | 1 | < 0.1% |
| 2808139 | 1 | < 0.1% |
| 2808138 | 1 | < 0.1% |
| 2808137 | 1 | < 0.1% |
| 2808136 | 1 | < 0.1% |
SAMPLE_TYPE
Categorical
High correlation  Imbalance 
| Distinct | 13 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.4 MiB |
| surgery-fixed | 10572 |
|---|---|
| surgery - NOS | 5680 |
| surgery fresh/frozen | 1515 |
| fixed - NOS | 801 |
| NS | 624 |
| Other values (8) | 180 |
Length
| Max length | 20 |
|---|---|
| Median length | 13 |
| Mean length | 13.145003 |
| Min length | 2 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | surgery-fixed |
|---|---|
| 2nd row | surgery-fixed |
| 3rd row | surgery-fixed |
| 4th row | surgery-fixed |
| 5th row | surgery-fixed |
Common Values
| Value | Count | Frequency (%) |
| surgery-fixed | 10572 | 54.6% |
| surgery - NOS | 5680 | 29.3% |
| surgery fresh/frozen | 1515 | 7.8% |
| fixed - NOS | 801 | 4.1% |
| NS | 624 | 3.2% |
| fine needle aspirate | 72 | 0.4% |
| fresh/frozen - NOS | 45 | 0.2% |
| autopsy-fixed | 26 | 0.1% |
| cell-line | 23 | 0.1% |
| circulating tumour | 5 | < 0.1% |
| Other values (3) | 9 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| surgery-fixed | 10572 | 31.0% |
| surgery | 7195 | 21.1% |
| (empty) | 6526 | 19.1% |
| nos | 6526 | 19.1% |
| fresh/frozen | 1560 | 4.6% |
| fixed | 801 | 2.3% |
| ns | 624 | 1.8% |
| fine | 72 | 0.2% |
| needle | 72 | 0.2% |
| aspirate | 72 | 0.2% |
| Other values (9) | 73 | 0.2% |
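The word counts above can be reproduced from the SAMPLE_TYPE Common Values table. The sketch below assumes each value is lowercased, split on whitespace, and stripped of leading and trailing punctuation. The report does not document its tokenizer, so this is an inference from the counts, and `tokenize` is a hypothetical helper. Under that assumption a standalone "-" becomes an empty token, which would explain the row with no visible value.

```python
import string
from collections import Counter

# Counts copied from the "Common Values" table (the ten listed values only).
common_values = {
    "surgery-fixed": 10572,
    "surgery - NOS": 5680,
    "surgery fresh/frozen": 1515,
    "fixed - NOS": 801,
    "NS": 624,
    "fine needle aspirate": 72,
    "fresh/frozen - NOS": 45,
    "autopsy-fixed": 26,
    "cell-line": 23,
    "circulating tumour": 5,
}

def tokenize(value):
    """Lowercase, split on whitespace, strip punctuation from both ends of each token."""
    return [token.strip(string.punctuation) for token in value.lower().split()]

word_counts = Counter()
for value, count in common_values.items():
    for token in tokenize(value):
        word_counts[token] += count

for word in ["surgery-fixed", "surgery", "", "nos", "fresh/frozen"]:
    print(repr(word), word_counts[word])
```

Hyphens and slashes inside a token survive because only the ends are stripped, so "surgery-fixed" and "fresh/frozen" remain single words.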
Most occurring characters
| Value | Count | Frequency (%) |
| r | 38754 | 15.2% |
| e | 32705 | 12.8% |
| s | 19429 | 7.6% |
| u | 17816 | 7.0% |
| y | 17793 | 7.0% |
| g | 17776 | 7.0% |
| - | 17152 | 6.7% |
| (space) | 14721 | 5.8% |
| f | 14595 | 5.7% |
| i | 11576 | 4.5% |
| Other values (18) | 52328 | 20.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 254645 |
INDIVIDUAL_ID
Real number (ℝ)
High correlation 
| Distinct | 18249 |
|---|---|
| Distinct (%) | 94.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1618704.4 |
| Minimum | 620961 |
| Maximum | 2659684 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 302.7 KiB |
Quantile statistics
| Minimum | 620961 |
|---|---|
| 5-th percentile | 722926.55 |
| Q1 | 1099513.8 |
| median | 1729265.5 |
| Q3 | 2146195.2 |
| 95-th percentile | 2596013 |
| Maximum | 2659684 |
| Range | 2038723 |
| Interquartile range (IQR) | 1046681.5 |
Descriptive statistics
| Standard deviation | 576460.23 |
|---|---|
| Coefficient of variation (CV) | 0.35612446 |
| Kurtosis | -1.1041569 |
| Mean | 1618704.4 |
| Median Absolute Deviation (MAD) | 466146 |
| Skewness | -0.015385481 |
| Sum | 3.1357542 × 10¹⁰ |
| Variance | 3.323064 × 10¹¹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1181721 | 17 | 0.1% |
| 1464316 | 10 | 0.1% |
| 1099583 | 10 | 0.1% |
| 1037922 | 8 | < 0.1% |
| 2197216 | 8 | < 0.1% |
| 1099581 | 8 | < 0.1% |
| 887717 | 7 | < 0.1% |
| 1053243 | 7 | < 0.1% |
| 887714 | 7 | < 0.1% |
| 1111312 | 7 | < 0.1% |
| Other values (18239) | 19283 | 99.5% |
| Value | Count | Frequency (%) |
| 620961 | 1 | < 0.1% |
| 665613 | 1 | < 0.1% |
| 665614 | 1 | < 0.1% |
| 665615 | 1 | < 0.1% |
| 665616 | 1 | < 0.1% |
| 665617 | 1 | < 0.1% |
| 665618 | 1 | < 0.1% |
| 665619 | 1 | < 0.1% |
| 665620 | 1 | < 0.1% |
| 665621 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 2659684 | 1 | < 0.1% |
| 2659683 | 1 | < 0.1% |
| 2659682 | 1 | < 0.1% |
| 2645583 | 1 | < 0.1% |
| 2645582 | 1 | < 0.1% |
| 2642377 | 1 | < 0.1% |
| 2642376 | 1 | < 0.1% |
| 2642375 | 1 | < 0.1% |
| 2642374 | 1 | < 0.1% |
| 2642373 | 1 | < 0.1% |
WHOLE_GENOME_SCREEN
Boolean
Constant 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 170.3 KiB |
| False |
|---|
| Value | Count | Frequency (%) |
| False | 19372 | 100.0% |
WHOLE_EXOME_SCREEN
Boolean
High correlation  Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 170.3 KiB |
| False | 19316 |
|---|---|
| True | 56 |
| Value | Count | Frequency (%) |
| False | 19316 | 99.7% |
| True | 56 | 0.3% |
TARGETED_SCREEN
Boolean
High correlation  Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 170.3 KiB |
| True | 19309 |
|---|---|
| False | 63 |
| Value | Count | Frequency (%) |
| True | 19309 | 99.7% |
| False | 63 | 0.3% |
RNASEQ_SCREEN
Boolean
High correlation  Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 170.3 KiB |
| False | 19365 |
|---|---|
| True | 7 |
| Value | Count | Frequency (%) |
| False | 19365 | > 99.9% |
| True | 7 | < 0.1% |
REARRANGEMENT_SCREEN
Boolean
Constant 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 170.3 KiB |
| False |
|---|
| Value | Count | Frequency (%) |
| False | 19372 | 100.0% |
TUMOUR_SOURCE
Categorical
High correlation 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.3 MiB |
| NS | 12576 |
|---|---|
| primary | 5693 |
| recurrent | 621 |
| metastasis | 477 |
| secondary | 5 |
Length
| Max length | 10 |
|---|---|
| Median length | 2 |
| Mean length | 3.8925769 |
| Min length | 2 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | primary |
|---|---|
| 2nd row | NS |
| 3rd row | primary |
| 4th row | primary |
| 5th row | primary |
Common Values
| Value | Count | Frequency (%) |
| NS | 12576 | 64.9% |
| primary | 5693 | 29.4% |
| recurrent | 621 | 3.2% |
| metastasis | 477 | 2.5% |
| secondary | 5 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| ns | 12576 | 64.9% |
| primary | 5693 | 29.4% |
| recurrent | 621 | 3.2% |
| metastasis | 477 | 2.5% |
| secondary | 5 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 13254 | 17.6% |
| N | 12576 | 16.7% |
| S | 12576 | 16.7% |
| a | 6652 | 8.8% |
| m | 6170 | 8.2% |
| i | 6170 | 8.2% |
| y | 5698 | 7.6% |
| p | 5693 | 7.5% |
| e | 1724 | 2.3% |
| t | 1575 | 2.1% |
| Other values (6) | 3319 | 4.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 75407 |
NORMAL_TISSUE_TESTED
Boolean
High correlation  Missing 
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 17683 |
| Missing (%) | 91.3% |
| Memory size | 189.2 KiB |
| True | 1115 |
|---|---|
| False | 574 |
| (Missing) | 17683 |
| Value | Count | Frequency (%) |
| True | 1115 | 5.8% |
| False | 574 | 3.0% |
| (Missing) | 17683 | 91.3% |
GENDER
Categorical
High correlation 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| u | 15685 |
|---|---|
| m | 2050 |
| f | 1637 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | u |
|---|---|
| 2nd row | u |
| 3rd row | u |
| 4th row | m |
| 5th row | u |
Common Values
| Value | Count | Frequency (%) |
| u | 15685 | 81.0% |
| m | 2050 | 10.6% |
| f | 1637 | 8.5% |
Words
| Value | Count | Frequency (%) |
| u | 15685 | 81.0% |
| m | 2050 | 10.6% |
| f | 1637 | 8.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| u | 15685 | 81.0% |
| m | 2050 | 10.6% |
| f | 1637 | 8.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 19372 |
AGE
Real number (ℝ)
Missing 
| Distinct | 97 |
|---|---|
| Distinct (%) | 2.7% |
| Missing | 15740 |
| Missing (%) | 81.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 58.743868 |
| Minimum | 0.33 |
| Maximum | 96 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 302.7 KiB |
Quantile statistics
| Minimum | 0.33 |
|---|---|
| 5-th percentile | 30 |
| Q1 | 50 |
| median | 61 |
| Q3 | 69 |
| 95-th percentile | 80 |
| Maximum | 96 |
| Range | 95.67 |
| Interquartile range (IQR) | 19 |
Descriptive statistics
| Standard deviation | 15.274116 |
|---|---|
| Coefficient of variation (CV) | 0.26001209 |
| Kurtosis | 0.64834094 |
| Mean | 58.743868 |
| Median Absolute Deviation (MAD) | 9 |
| Skewness | -0.72165915 |
| Sum | 213357.73 |
| Variance | 233.29862 |
| Monotonicity | Not monotonic |
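Several of the AGE statistics above are linked by simple identities, which gives a quick consistency check on the table. The sketch below recomputes them from the reported figures. Variable names are illustrative, and small differences are expected because the inputs are already rounded.

```python
# Figures copied from the AGE tables in this report.
n_rows, n_missing = 19372, 15740
mean, std = 58.743868, 15.274116
q1, q3 = 50, 69
minimum, maximum = 0.33, 96

n_present = n_rows - n_missing   # rows with a recorded age
cv = std / mean                  # coefficient of variation
variance = std ** 2
total = mean * n_present         # should match the reported Sum
iqr = q3 - q1
value_range = maximum - minimum

print(n_present, round(cv, 8), round(variance, 5), round(total, 2), iqr, round(value_range, 2))
```

Only 3632 of the 19372 rows carry an age, so these statistics describe a small subset of the samples.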
| Value | Count | Frequency (%) |
| 68 | 125 | 0.6% |
| 65 | 118 | 0.6% |
| 64 | 116 | 0.6% |
| 66 | 112 | 0.6% |
| 59 | 109 | 0.6% |
| 54 | 108 | 0.6% |
| 60 | 105 | 0.5% |
| 62 | 102 | 0.5% |
| 67 | 101 | 0.5% |
| 52 | 99 | 0.5% |
| Other values (87) | 2537 | 13.1% |
| (Missing) | 15740 | 81.3% |
| Value | Count | Frequency (%) |
| 0.33 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 6.9 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 2 | < 0.1% |
| 9 | 2 | < 0.1% |
| 10 | 8 | < 0.1% |
| 11 | 6 | < 0.1% |
| 12 | 13 | 0.1% |
| 13 | 6 | < 0.1% |
| Value | Count | Frequency (%) |
| 96 | 1 | < 0.1% |
| 94 | 1 | < 0.1% |
| 93 | 1 | < 0.1% |
| 92 | 6 | < 0.1% |
| 91 | 2 | < 0.1% |
| 90 | 2 | < 0.1% |
| 89 | 7 | < 0.1% |
| 88 | 9 | < 0.1% |
| 87 | 5 | < 0.1% |
| 86 | 14 | 0.1% |
THERAPY_RELATIONSHIP
Text
Missing
| Distinct | 52 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 16397 |
| Missing (%) | 84.6% |
| Memory size | 941.0 KiB |
Length
| Max length | 96 |
|---|---|
| Median length | 39 |
| Mean length | 38.43563 |
| Min length | 27 |
Unique
| Unique | 32 |
|---|---|
| Unique (%) | 1.1% |
Sample
| 1st row | Sample analysed before imatinib therapy |
|---|---|
| 2nd row | Sample analysed before imatinib therapy |
| 3rd row | Sample analysed during imatinib therapy |
| 4th row | Sample analysed during imatinib therapy |
| 5th row | Sample taken before imatinib therapy |
| Value | Count | Frequency (%) |
| sample | 2975 | 19.6% |
| therapy | 2965 | 19.5% |
| imatinib | 2938 | 19.4% |
| analysed | 2031 | 13.4% |
| before | 2020 | 13.3% |
| taken | 944 | 6.2% |
| after | 734 | 4.8% |
| during | 233 | 1.5% |
| and | 53 | 0.3% |
| months | 34 | 0.2% |
| Other values (54) | 240 | 1.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 14750 | 12.9% |
| e | 13829 | 12.1% |
| (space) | 12193 | 10.7% |
| i | 9219 | 8.1% |
| t | 7781 | 6.8% |
| n | 6391 | 5.6% |
| r | 6012 | 5.3% |
| m | 5977 | 5.2% |
| p | 5960 | 5.2% |
| l | 5035 | 4.4% |
| Other values (39) | 27199 | 23.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 114346 |
SAMPLE_DIFFERENTIATOR
Text
Missing
| Distinct | 38 |
|---|---|
| Distinct (%) | 55.1% |
| Missing | 19303 |
| Missing (%) | 99.6% |
| Memory size | 761.4 KiB |
Length
| Max length | 86 |
|---|---|
| Median length | 61 |
| Mean length | 44.42029 |
| Min length | 25 |
Unique
| Unique | 29 |
|---|---|
| Unique (%) | 42.0% |
Sample
| 1st row | Sample from spindle cell component |
|---|---|
| 2nd row | Sample from the spindle cell component |
| 3rd row | Sample from dedifferentiated anaplastic component |
| 4th row | Primary pattern (Sarcomatous spindle) |
| 5th row | Sample from dedifferentiated epitheloid/pleomorphic component |
| Value | Count | Frequency (%) |
| sample | 53 | 13.6% |
| from | 43 | 11.1% |
| component | 43 | 11.1% |
| cell | 29 | 7.5% |
| spindle | 24 | 6.2% |
| the | 13 | 3.3% |
| of | 13 | 3.3% |
| epithelioid | 12 | 3.1% |
| pattern | 12 | 3.1% |
| dedifferentiated | 11 | 2.8% |
| Other values (64) | 136 | 35.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 335 | 10.9% |
| (space) | 321 | 10.5% |
| o | 239 | 7.8% |
| n | 199 | 6.5% |
| a | 189 | 6.2% |
| t | 184 | 6.0% |
| l | 180 | 5.9% |
| p | 176 | 5.7% |
| m | 176 | 5.7% |
| i | 166 | 5.4% |
| Other values (44) | 900 | 29.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 3065 |
MUTATION_ALLELE_SPECIFICATION
Categorical
High correlation  Missing 
| Distinct | 9 |
|---|---|
| Distinct (%) | 8.5% |
| Missing | 19266 |
| Missing (%) | 99.5% |
| Memory size | 1.3 MiB |
| Secondary mutations located on same chromosome as the primary mutation | 62 |
|---|---|
| Mutations located on single chromosome | 19 |
| Secondary mutations located on same chromosome as the primary mutation. Primary mutation p.V560D and e13 frameshift insertion located on different chromosomes | 18 |
| KIT mutations located on single chromosome | 2 |
| KIT p.V559G primary mutation occurs on same chromosome as p.Y578C | 1 |
| Other values (4) | 4 |
Length
| Max length | 158 |
|---|---|
| Median length | 70 |
| Mean length | 78.915094 |
| Min length | 38 |
Unique
| Unique | 5 |
|---|---|
| Unique (%) | 4.7% |
Sample
| 1st row | Secondary mutations located on same chromosome as the primary mutation |
|---|---|
| 2nd row | Mutations located on single chromosome |
| 3rd row | Secondary mutations located on same chromosome as the primary mutation |
| 4th row | Secondary mutations located on same chromosome as the primary mutation |
| 5th row | Mutations located on single chromosome |
Common Values
| Value | Count | Frequency (%) |
| Secondary mutations located on same chromosome as the primary mutation | 62 | 0.3% |
| Mutations located on single chromosome | 19 | 0.1% |
| Secondary mutations located on same chromosome as the primary mutation. Primary mutation p.V560D and e13 frameshift insertion located on different chromosomes | 18 | 0.1% |
| KIT mutations located on single chromosome | 2 | < 0.1% |
| KIT p.V559G primary mutation occurs on same chromosome as p.Y578C | 1 | < 0.1% |
| PDGFRA mutations located on different chromosomes | 1 | < 0.1% |
| Not known if KIT p.W557G primary mutation occurs on same chromosome as p.V569_Y578del | 1 | < 0.1% |
| KIT p.V559G primary mutation occurs on same chromosome as p.Y578C, and also with p.D579del | 1 | < 0.1% |
| Primary mutation p.V560D and e13 frameshift insertion located on different chromosomes | 1 | < 0.1% |
| (Missing) | 19266 | 99.5% |
Words
| Value | Count | Frequency (%) |
| on | 124 | 10.7% |
| located | 121 | 10.4% |
| chromosome | 104 | 9.0% |
| mutations | 102 | 8.8% |
| primary | 102 | 8.8% |
| mutation | 102 | 8.8% |
| same | 83 | 7.2% |
| as | 83 | 7.2% |
| secondary | 80 | 6.9% |
| the | 80 | 6.9% |
| Other values (21) | 178 | 15.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1053 | 12.6% |
| o | 926 | 11.1% |
| a | 713 | 8.5% |
| t | 669 | 8.0% |
| m | 637 | 7.6% |
| e | 608 | 7.3% |
| n | 509 | 6.1% |
| s | 475 | 5.7% |
| r | 469 | 5.6% |
| i | 406 | 4.9% |
| Other values (38) | 1900 | 22.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 8365 |
MSI
Categorical
Constant 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.3 MiB |
| Unknown |
|---|
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 7 |
| Min length | 7 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Unknown |
|---|---|
| 2nd row | Unknown |
| 3rd row | Unknown |
| 4th row | Unknown |
| 5th row | Unknown |
Common Values
| Value | Count | Frequency (%) |
| Unknown | 19372 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| unknown | 19372 |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 58116 | |
| U | 19372 | 14.3% |
| k | 19372 | 14.3% |
| o | 19372 | 14.3% |
| w | 19372 | 14.3% |
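The character counts for MSI follow directly from the single value "Unknown" appearing in all 19372 rows. A minimal sketch, assuming the column is available as a pandas Series (the name `msi` is hypothetical and the Series is rebuilt here from the counts above), reproduces the figures in the table:

```python
from collections import Counter

import pandas as pd

# Hypothetical stand-in for the MSI column: one constant value in every row.
msi = pd.Series(["Unknown"] * 19372)

# Count every character across all values.
char_counts = Counter("".join(msi))
total_chars = sum(char_counts.values())

# Print each character with its count and share of all characters.
for char, count in char_counts.most_common():
    print(f"{char!r}: {count} ({count / total_chars:.1%})")
```

The letter "n" occurs three times in "Unknown", which is why its count is three times that of every other character.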
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 135604 |
AVERAGE_PLOIDY
Unsupported
Missing  Rejected  Unsupported 
| Missing | 19372 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 302.7 KiB |
SAMPLE_REMARK
Unsupported
Missing  Rejected  Unsupported 
| Missing | 19372 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 302.7 KiB |
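AVERAGE_PLOIDY and SAMPLE_REMARK hold no values at all, and MSI holds a single value in every row. A rough sketch of how such columns might be flagged before analysis is given below. The DataFrame `df` is a toy stand-in with illustrative values, not the profiled dataset, and the two checks are assumptions about what "fully missing" and "constant" should mean rather than the profiler's own logic.

```python
import pandas as pd

# Toy frame standing in for the profiled dataset (values are illustrative only).
df = pd.DataFrame({
    "MSI": ["Unknown", "Unknown", "Unknown"],
    "AVERAGE_PLOIDY": [None, None, None],
    "SAMPLE_REMARK": [None, None, None],
    "GRADE": [None, "3", None],
})

# Columns where every cell is missing.
all_missing = [c for c in df.columns if df[c].isna().all()]

# Columns with no missing cells and exactly one distinct value.
constant = [
    c for c in df.columns
    if df[c].notna().all() and df[c].nunique() == 1
]

print(all_missing)
print(constant)
```

GRADE is not reported as constant in this toy frame because it still contains missing cells.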
DRUG_RESPONSE
Text
Missing 
| Distinct | 66 |
|---|---|
| Distinct (%) | 3.6% |
| Missing | 17537 |
| Missing (%) | 90.5% |
| Memory size | 886.8 KiB |
Length
| Max length | 389 |
|---|---|
| Median length | 183 |
| Mean length | 47.583106 |
| Min length | 27 |
Unique
| Unique | 30 |
|---|---|
| Unique (%) | 1.6% |
Sample
| 1st row | Imatinib clinical primary non response (local progression) |
|---|---|
| 2nd row | Imatinib clinical resistant recurrence;Sunitinib clinical response - not further specified |
| 3rd row | Imatinib clinical resistant recurrence |
| 4th row | Imatinib clinical resistant recurrence |
| 5th row | Imatinib clinical response - not further specified |
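The second sample row packs two drug responses into one cell, separated by a semicolon. A minimal sketch of splitting such cells into one entry per drug follows, using only the five sample rows above. Treating the first word of each entry as the drug name is an assumption based on those rows, and the variable names are hypothetical.

```python
import pandas as pd

# The five sample rows shown above.
drug_response = pd.Series([
    "Imatinib clinical primary non response (local progression)",
    "Imatinib clinical resistant recurrence;Sunitinib clinical response - not further specified",
    "Imatinib clinical resistant recurrence",
    "Imatinib clinical resistant recurrence",
    "Imatinib clinical response - not further specified",
])

# Split on ';' and give each drug-level entry its own row.
entries = drug_response.str.split(";").explode().str.strip()

# Assumption: the first word of each entry is the drug name.
by_drug = entries.str.split().str[0].value_counts()

print(by_drug)
```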
| Value | Count | Frequency (%) |
| imatinib | 1822 | |
| clinical | 1797 | |
| response | 1088 | |
| resistant | 796 | |
| recurrence | 779 | |
| | 695 | 6.3% |
| not | 672 | 6.1% |
| further | 672 | 6.1% |
| specified | 671 | 6.1% |
| non | 372 | 3.4% |
| Other values (108) | 1628 |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 10784 | |
| (space) | 9181 | |
| e | 8420 | |
| n | 8253 | |
| r | 6605 | 7.6% |
| c | 5973 | 6.8% |
| a | 5532 | 6.3% |
| t | 5481 | 6.3% |
| s | 5172 | 5.9% |
| l | 3986 | 4.6% |
| Other values (44) | 17928 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 87315 |
GRADE
Categorical
High correlation  Imbalance  Missing 
| Distinct | 3 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 19231 |
| Missing (%) | 99.3% |
| Memory size | 1.3 MiB |
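The two percentages in this summary use different denominators. The arithmetic below is a sketch that reproduces them from the counts shown, under the assumption that Missing (%) is taken over all 19372 observations while Distinct (%) is taken over the non-missing rows only. That assumption is consistent with the figures reported here.

```python
n_rows = 19372      # Number of observations, from the dataset overview
n_missing = 19231   # Missing count for GRADE
n_distinct = 3      # Distinct count for GRADE

# Rows that actually carry a GRADE value.
n_present = n_rows - n_missing

missing_pct = 100 * n_missing / n_rows
# Assumption: the distinct percentage is relative to non-missing rows.
distinct_pct = 100 * n_distinct / n_present

print(f"{missing_pct:.1f}% missing, {distinct_pct:.1f}% distinct")
# prints "99.3% missing, 2.1% distinct"
```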
| Some Grade data are given in publication | |
|---|---|
| 3 | 2 |
| 2 | 1 |
Length
| Max length | 40 |
|---|---|
| Median length | 40 |
| Mean length | 39.170213 |
| Min length | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.7% |
Sample
| 1st row | Some Grade data are given in publication |
|---|---|
| 2nd row | Some Grade data are given in publication |
| 3rd row | Some Grade data are given in publication |
| 4th row | Some Grade data are given in publication |
| 5th row | Some Grade data are given in publication |
Common Values
| Value | Count | Frequency (%) |
| Some Grade data are given in publication | 138 | 0.7% |
| 3 | 2 | < 0.1% |
| 2 | 1 | < 0.1% |
| (Missing) | 19231 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| some | 138 | |
| grade | 138 | |
| data | 138 | |
| are | 138 | |
| given | 138 | |
| in | 138 | |
| publication | 138 | |
| 3 | 2 | 0.2% |
| 2 | 1 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 828 | |
| a | 690 | |
| e | 552 | |
| i | 552 | |
| n | 414 | 7.5% |
| t | 276 | 5.0% |
| r | 276 | 5.0% |
| o | 276 | 5.0% |
| d | 276 | 5.0% |
| m | 138 | 2.5% |
| Other values (11) | 1245 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 5523 |
Missing 
| Distinct | 29 |
|---|---|
| Distinct (%) | 78.4% |
| Missing | 19335 |
| Missing (%) | 99.8% |
| Memory size | 758.8 KiB |
Length
| Max length | 133 |
|---|---|
| Median length | 70 |
| Mean length | 33 |
| Min length | 2 |
Unique
| Unique | 23 |
|---|---|
| Unique (%) | 62.2% |
Sample
| 1st row | 33 months after initial diagnosis |
|---|---|
| 2nd row | 3 years after primary diagnosis |
| 3rd row | 3 years after primary diagnosis |
| 4th row | 57 months after nilotinib therapy started |
| 5th row | 24 months after initial diagnosis |
| Value | Count | Frequency (%) |
| after | 27 | 14.3% |
| months | 21 | 11.1% |
| diagnosis | 15 | 7.9% |
| therapy | 11 | 5.8% |
| started | 9 | 4.8% |
| years | 8 | 4.2% |
| initial | 8 | 4.2% |
| imatinib | 6 | 3.2% |
| treatment | 4 | 2.1% |
| 15 | 4 | 2.1% |
| Other values (45) | 76 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 153 | |
| t | 120 | 9.8% |
| i | 114 | 9.3% |
| a | 104 | 8.5% |
| s | 86 | 7.0% |
| r | 83 | 6.8% |
| e | 83 | 6.8% |
| n | 82 | 6.7% |
| o | 54 | 4.4% |
| m | 36 | 2.9% |
| Other values (32) | 306 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1221 |
STAGE
Categorical
High correlation  Imbalance  Missing 
| Distinct | 7 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 19097 |
| Missing (%) | 98.6% |
| Memory size | 1.3 MiB |
| Some Stage data are given in publication | |
|---|---|
| T3 | 16 |
| IV | 5 |
| T4 | 5 |
| T2 | 2 |
| Other values (2) | 3 |
Length
| Max length | 40 |
|---|---|
| Median length | 40 |
| Mean length | 35.716364 |
| Min length | 2 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.4% |
Sample
| 1st row | Some Stage data are given in publication |
|---|---|
| 2nd row | Some Stage data are given in publication |
| 3rd row | Some Stage data are given in publication |
| 4th row | Some Stage data are given in publication |
| 5th row | Some Stage data are given in publication |
Common Values
| Value | Count | Frequency (%) |
| Some Stage data are given in publication | 244 | 1.3% |
| T3 | 16 | 0.1% |
| IV | 5 | < 0.1% |
| T4 | 5 | < 0.1% |
| T2 | 2 | < 0.1% |
| II | 2 | < 0.1% |
| IA | 1 | < 0.1% |
| (Missing) | 19097 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| some | 244 | |
| stage | 244 | |
| data | 244 | |
| are | 244 | |
| given | 244 | |
| in | 244 | |
| publication | 244 | |
| t3 | 16 | 0.9% |
| iv | 5 | 0.3% |
| t4 | 5 | 0.3% |
| Other values (3) | 5 | 0.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1464 | |
| a | 1220 | |
| e | 976 | |
| i | 976 | |
| t | 732 | 7.5% |
| n | 732 | 7.5% |
| o | 488 | 5.0% |
| g | 488 | 5.0% |
| S | 488 | 5.0% |
| d | 244 | 2.5% |
| Other values (15) | 2014 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 9822 |
CYTOGENETICS
Categorical
High correlation  Missing 
| Distinct | 3 |
|---|---|
| Distinct (%) | 7.1% |
| Missing | 19330 |
| Missing (%) | 99.8% |
| Memory size | 1.3 MiB |
| del(22) | |
|---|---|
| ALK fusion negative | |
| Normal | 3 |
Length
| Max length | 19 |
|---|---|
| Median length | 7 |
| Mean length | 8.0714286 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Normal |
|---|---|
| 2nd row | del(22) |
| 3rd row | del(22) |
| 4th row | del(22) |
| 5th row | del(22) |
Common Values
| Value | Count | Frequency (%) |
| del(22) | 35 | 0.2% |
| ALK fusion negative | 4 | < 0.1% |
| Normal | 3 | < 0.1% |
| (Missing) | 19330 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| del(22 | 35 | |
| alk | 4 | 8.0% |
| fusion | 4 | 8.0% |
| negative | 4 | 8.0% |
| normal | 3 | 6.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 70 | |
| e | 43 | |
| l | 38 | |
| d | 35 | |
| ( | 35 | |
| ) | 35 | |
| (space) | 8 | 2.4% |
| n | 8 | 2.4% |
| i | 8 | 2.4% |
| a | 7 | 2.1% |
| Other values (13) | 52 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 339 |
METASTATIC_SITE
Categorical
High correlation  Missing 
| Distinct | 44 |
|---|---|
| Distinct (%) | 9.6% |
| Missing | 18912 |
| Missing (%) | 97.6% |
| Memory size | 1.3 MiB |
| liver | |
|---|---|
| peritoneum | |
| NS | |
| omentum | 14 |
| stomach | 12 |
| Other values (39) |
Length
| Max length | 43 |
|---|---|
| Median length | 22 |
| Mean length | 6.9391304 |
| Min length | 2 |
Unique
| Unique | 23 |
|---|---|
| Unique (%) | 5.0% |
Sample
| 1st row | liver |
|---|---|
| 2nd row | liver |
| 3rd row | NS |
| 4th row | peritoneum |
| 5th row | stomach |
Common Values
| Value | Count | Frequency (%) |
| liver | 140 | 0.7% |
| peritoneum | 128 | 0.7% |
| NS | 75 | 0.4% |
| omentum | 14 | 0.1% |
| stomach | 12 | 0.1% |
| mesentery | 11 | 0.1% |
| abdomen | 10 | 0.1% |
| colon | 7 | < 0.1% |
| pelvis | 5 | < 0.1% |
| lymph node | 4 | < 0.1% |
| Other values (34) | 54 | 0.3% |
| (Missing) | 18912 |
Length
| Value | Count | Frequency (%) |
| liver | 140 | |
| peritoneum | 128 | |
| ns | 75 | |
| omentum | 15 | 3.1% |
| abdomen | 13 | 2.7% |
| stomach | 12 | 2.5% |
| mesentery | 11 | 2.3% |
| colon | 7 | 1.4% |
| skin | 6 | 1.2% |
| pelvis | 5 | 1.0% |
| Other values (38) | 72 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 517 | |
| i | 328 | |
| r | 313 | |
| n | 224 | 7.0% |
| m | 222 | 7.0% |
| o | 206 | 6.5% |
| t | 205 | 6.4% |
| l | 201 | 6.3% |
| u | 163 | 5.1% |
| p | 163 | 5.1% |
| Other values (18) | 650 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 3192 |
TUMOUR_REMARK
Categorical
High correlation  Missing 
| Distinct | 44 |
|---|---|
| Distinct (%) | 4.7% |
| Missing | 18428 |
| Missing (%) | 95.1% |
| Memory size | 1.4 MiB |
| Sample has a KIT exon 11 mutation (Sample has a Wildtype KIT exon 11) | |
|---|---|
| Tumour preselected as being negative for mutations in KIT e9, e11, e13, e17 and PDGFRA e12 and e18. (Tumour preselected as being negative for mutations in KIT e9, e11, e13, e17 and PDGFRA e12 and e18.) | |
| Sample has Wildtype KIT exons 9 and 11 (Sample has Wildtype KIT exons 9 and 11) | |
| Sample has a KIT exon 11 mutation (Sample has a KIT exon 11 mutation) | |
| Advanced GIST | |
| Other values (39) |
Length
| Max length | 201 |
|---|---|
| Median length | 175 |
| Mean length | 91.681144 |
| Min length | 13 |
Unique
| Unique | 27 |
|---|---|
| Unique (%) | 2.9% |
Sample
| 1st row | Tumour preselected as being negative for mutations in KIT e9, e11, e13, e17 and PDGFRA e12 and e18. (Tumour preselected as being negative for mutations in KIT e9, e11, e13, e17 and PDGFRA e12 and e18.) |
|---|---|
| 2nd row | Sample has a KIT exon 11 mutation (Sample has a Wildtype KIT exon 11) |
| 3rd row | Sample has a KIT exon 11 mutation (Sample has a Wildtype KIT exon 11) |
| 4th row | Sample has a KIT exon 11 mutation (Sample has a Wildtype KIT exon 11) |
| 5th row | Sample has a KIT exon 11 mutation (Sample has a Wildtype KIT exon 11) |
Common Values
| Value | Count | Frequency (%) |
| Sample has a KIT exon 11 mutation (Sample has a Wildtype KIT exon 11) | 353 | 1.8% |
| Tumour preselected as being negative for mutations in KIT e9, e11, e13, e17 and PDGFRA e12 and e18. (Tumour preselected as being negative for mutations in KIT e9, e11, e13, e17 and PDGFRA e12 and e18.) | 200 | 1.0% |
| Sample has Wildtype KIT exons 9 and 11 (Sample has Wildtype KIT exons 9 and 11) | 94 | 0.5% |
| Sample has a KIT exon 11 mutation (Sample has a KIT exon 11 mutation) | 60 | 0.3% |
| Advanced GIST | 58 | 0.3% |
| Succinate B dehydrogenase positive tumour | 38 | 0.2% |
| multifocal (multifocal) | 24 | 0.1% |
| At time of surgery (GIST ruptured before or during surgery) | 21 | 0.1% |
| Juvenile JPA (KIT negative) | 20 | 0.1% |
| Multicentric GIST (Multicentric GIST) | 12 | 0.1% |
| Other values (34) | 64 | 0.3% |
| (Missing) | 18428 |
Length
| Value | Count | Frequency (%) |
| kit | 1434 | 9.0% |
| has | 1016 | 6.4% |
| sample | 1014 | 6.4% |
| 11 | 1014 | 6.4% |
| and | 998 | 6.3% |
| a | 828 | 5.2% |
| exon | 826 | 5.2% |
| wildtype | 541 | 3.4% |
| tumour | 504 | 3.2% |
| mutation | 475 | 3.0% |
| Other values (107) | 7235 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 14942 | |
| e | 8712 | 10.1% |
| a | 5966 | 6.9% |
| n | 4524 | 5.2% |
| 1 | 4452 | 5.1% |
| t | 3697 | 4.3% |
| i | 3307 | 3.8% |
| o | 3282 | 3.8% |
| s | 2741 | 3.2% |
| m | 2632 | 3.0% |
| Other values (46) | 32292 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 86547 |
ETHNICITY
Categorical
High correlation  Missing 
| Distinct | 14 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 17527 |
| Missing (%) | 90.5% |
| Memory size | 1.3 MiB |
| Chinese | |
|---|---|
| Korean | |
| Slovakian | |
| Japanese | |
| Italian | |
| Other values (9) |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 7.4243902 |
| Min length | 5 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | Taiwanese |
|---|---|
| 2nd row | Slovakian |
| 3rd row | Caucasian |
| 4th row | Italian |
| 5th row | Slovakian |
Common Values
| Value | Count | Frequency (%) |
| Chinese | 724 | 3.7% |
| Korean | 333 | 1.7% |
| Slovakian | 278 | 1.4% |
| Japanese | 168 | 0.9% |
| Italian | 83 | 0.4% |
| Portuguese | 78 | 0.4% |
| Caucasian | 42 | 0.2% |
| Panamanian | 39 | 0.2% |
| Taiwanese | 38 | 0.2% |
| Greek | 38 | 0.2% |
| Other values (4) | 24 | 0.1% |
| (Missing) | 17527 |
Length
| Value | Count | Frequency (%) |
| chinese | 724 | |
| korean | 333 | |
| slovakian | 278 | 15.1% |
| japanese | 168 | 9.1% |
| italian | 83 | 4.5% |
| portuguese | 78 | 4.2% |
| caucasian | 42 | 2.3% |
| panamanian | 39 | 2.1% |
| taiwanese | 38 | 2.1% |
| greek | 38 | 2.1% |
| Other values (4) | 24 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 2425 | |
| n | 1807 | |
| a | 1774 | |
| i | 1228 | |
| s | 1069 | |
| C | 766 | 5.6% |
| h | 724 | 5.3% |
| o | 690 | 5.0% |
| r | 449 | 3.3% |
| l | 364 | 2.7% |
| Other values (21) | 2402 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 13698 |
ENVIRONMENTAL_VARIABLES
Categorical
High correlation  Missing 
| Distinct | 3 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 19284 |
| Missing (%) | 99.5% |
| Memory size | 1.3 MiB |
| Non-smoker | |
|---|---|
| Smoker | |
| Non smoker | 2 |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 8.9090909 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Non-smoker |
|---|---|
| 2nd row | Non-smoker |
| 3rd row | Non-smoker |
| 4th row | Smoker |
| 5th row | Non-smoker |
Common Values
| Value | Count | Frequency (%) |
| Non-smoker | 62 | 0.3% |
| Smoker | 24 | 0.1% |
| Non smoker | 2 | < 0.1% |
| (Missing) | 19284 |
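"Non-smoker" and "Non smoker" look like two spellings of one category, which would explain why the report counts three distinct values. A minimal sketch of merging them is shown below. The Series is rebuilt from the counts in the table above, and treating the two spellings as equivalent is an assumption that should be checked against the source data.

```python
import pandas as pd

# Rebuild the non-missing values from the counts in the table above.
env = pd.Series(["Non-smoker"] * 62 + ["Smoker"] * 24 + ["Non smoker"] * 2)

# Treat a space and a hyphen after "Non" as the same spelling.
normalized = env.str.replace(r"^Non[ -]smoker$", "Non-smoker", regex=True)
counts = normalized.value_counts()

print(counts)
```

After normalization the column has two categories: 64 non-smokers and 24 smokers.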
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| non-smoker | 62 | |
| smoker | 26 | |
| non | 2 | 2.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 152 | |
| m | 88 | |
| k | 88 | |
| r | 88 | |
| e | 88 | |
| n | 64 | |
| N | 64 | |
| s | 64 | |
| - | 62 | |
| S | 24 | 3.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 784 |
GERMLINE_MUTATION
Categorical
High correlation  Missing 
| Distinct | 5 |
|---|---|
| Distinct (%) | 31.2% |
| Missing | 19356 |
| Missing (%) | 99.9% |
| Memory size | 1.3 MiB |
| SDHA | |
|---|---|
| SDHD | |
| SDHD (c.448_449insATCT/p.C150Yfs*42) | |
| SDHA (Unknown which curated mutation is germline and which is somatic. ) | 1 |
| SDHB | 1 |
Length
| Max length | 72 |
|---|---|
| Median length | 4 |
| Mean length | 12.25 |
| Min length | 4 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | 12.5% |
Sample
| 1st row | SDHA |
|---|---|
| 2nd row | SDHA |
| 3rd row | SDHA |
| 4th row | SDHA |
| 5th row | SDHA |
Common Values
| Value | Count | Frequency (%) |
| SDHA | 10 | 0.1% |
| SDHD | 2 | < 0.1% |
| SDHD (c.448_449insATCT/p.C150Yfs*42) | 2 | < 0.1% |
| SDHA (Unknown which curated mutation is germline and which is somatic. ) | 1 | < 0.1% |
| SDHB | 1 | < 0.1% |
| (Missing) | 19356 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| sdha | 11 | |
| sdhd | 4 | 13.8% |
| c.448_449insatct/p.c150yfs*42 | 2 | 6.9% |
| which | 2 | 6.9% |
| is | 2 | 6.9% |
| unknown | 1 | 3.4% |
| curated | 1 | 3.4% |
| mutation | 1 | 3.4% |
| germline | 1 | 3.4% |
| and | 1 | 3.4% |
| Other values (3) | 3 | 10.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| D | 20 | 10.2% |
| S | 16 | 8.2% |
| H | 16 | 8.2% |
| A | 13 | 6.6% |
| (space) | 13 | 6.6% |
| 4 | 10 | 5.1% |
| i | 9 | 4.6% |
| n | 8 | 4.1% |
| s | 7 | 3.6% |
| c | 6 | 3.1% |
| Other values (32) | 78 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 196 |
THERAPY
Categorical
High correlation  Imbalance  Missing 
| Distinct | 10 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 18066 |
| Missing (%) | 93.3% |
| Memory size | 1.4 MiB |
| No prior imatinib therapy | |
|---|---|
| No prior therapy | |
| Possible Imatinib intolerant individual rather than primary or secondary non-response | 18 |
| Prior imatinib therapy | 17 |
| GIST treated with imatinib prior to desmoid tumour development | 11 |
| Other values (5) | 5 |
Length
| Max length | 96 |
|---|---|
| Median length | 25 |
| Mean length | 25.24732 |
| Min length | 16 |
Unique
| Unique | 5 |
|---|---|
| Unique (%) | 0.4% |
Sample
| 1st row | No prior imatinib therapy |
|---|---|
| 2nd row | No prior imatinib therapy |
| 3rd row | No prior imatinib therapy |
| 4th row | No prior imatinib therapy |
| 5th row | No prior imatinib therapy |
Common Values
| Value | Count | Frequency (%) |
| No prior imatinib therapy | 1100 | 5.7% |
| No prior therapy | 155 | 0.8% |
| Possible Imatinib intolerant individual rather than primary or secondary non-response | 18 | 0.1% |
| Prior imatinib therapy | 17 | 0.1% |
| GIST treated with imatinib prior to desmoid tumour development | 11 | 0.1% |
| Imatinib and erlotinib therapy alternated 2:4 weeks respectively | 1 | < 0.1% |
| Individual has been treated with sunitinib after progression on imatinib | 1 | < 0.1% |
| Prior imatinib therapy (intolerent of high dose imatinib therapy or progressive disaese) | 1 | < 0.1% |
| Individual was treated with surgery and neoadjuvant imatinib for gastrointestinal stromal tumour | 1 | < 0.1% |
| Individual has been treated with radiotherapy for endometrial carcinoma 7 years earlier | 1 | < 0.1% |
| (Missing) | 18066 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| prior | 1284 | |
| therapy | 1275 | |
| no | 1255 | |
| imatinib | 1151 | |
| individual | 21 | 0.4% |
| or | 19 | 0.4% |
| intolerant | 18 | 0.3% |
| rather | 18 | 0.3% |
| than | 18 | 0.3% |
| possible | 18 | 0.3% |
| Other values (40) | 172 | 3.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 4874 | |
| r | 4036 | |
| (space) | 3943 | |
| o | 2709 | |
| p | 2592 | |
| t | 2589 | |
| a | 2572 | |
| e | 1485 | 4.5% |
| h | 1330 | 4.0% |
| n | 1327 | 4.0% |
| Other values (27) | 5516 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 32973 |
FAMILY
Categorical
High correlation  Imbalance  Missing 
| Distinct | 10 |
|---|---|
| Distinct (%) | 7.6% |
| Missing | 19240 |
| Missing (%) | 99.3% |
| Memory size | 1.3 MiB |
| Sample from an individual with type 1 neurofibromatosis | |
|---|---|
| No family history of gastrointestinal stromal tumour or paraganglioma | 9 |
| No family history of cancer | 3 |
| Colon cancer (father) | 1 |
| Brother of sample 1117717 | 1 |
| Other values (5) | 5 |
Length
| Max length | 123 |
|---|---|
| Median length | 55 |
| Mean length | 54.340909 |
| Min length | 17 |
Unique
| Unique | 7 |
|---|---|
| Unique (%) | 5.3% |
Sample
| 1st row | No family history of gastrointestinal stromal tumour or paraganglioma |
|---|---|
| 2nd row | Sample from an individual with type 1 neurofibromatosis |
| 3rd row | No family history of cancer |
| 4th row | Colon cancer (father) |
| 5th row | Sample from an individual with type 1 neurofibromatosis |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Sample from an individual with type 1 neurofibromatosis | 113 | 0.6% |
| No family history of gastrointestinal stromal tumour or paraganglioma | 9 | < 0.1% |
| No family history of cancer | 3 | < 0.1% |
| Colon cancer (father) | 1 | < 0.1% |
| Brother of sample 1117717 | 1 | < 0.1% |
| Son of sample 1029463 | 1 | < 0.1% |
| Individual from a family with no known family history of gastrointestinal stromal tumour, paraganglioma or pheochromocytoma | 1 | < 0.1% |
| Brother of sample 1117718 | 1 | < 0.1% |
| Father of sample 1029464 | 1 | < 0.1% |
| Lung cancer (FDR) | 1 | < 0.1% |
| (Missing) | 19240 | 99.3% |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| sample | 117 | 11.3% |
| from | 114 | 11.0% |
| individual | 114 | 11.0% |
| with | 114 | 11.0% |
| an | 113 | 10.9% |
| type | 113 | 10.9% |
| 1 | 113 | 10.9% |
| neurofibromatosis | 113 | 10.9% |
| of | 17 | 1.6% |
| family | 14 | 1.3% |
| Other values (21) | 96 | 9.2% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 906 | 12.6% |
| i | 738 | 10.3% |
| o | 556 | 7.8% |
| a | 550 | 7.7% |
| r | 415 | 5.8% |
| t | 408 | 5.7% |
| m | 390 | 5.4% |
| n | 381 | 5.3% |
| e | 363 | 5.1% |
| l | 276 | 3.8% |
| Other values (34) | 2190 | 30.5% |
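The word and character tables for FAMILY can be reproduced from the value counts alone. The sketch below assumes that words are lower-cased, whitespace-separated tokens with surrounding punctuation stripped, and that characters are counted case-sensitively with spaces included. These rules are inferred from the counts in this report and may not match the profiler's actual tokenizer. `FAMILY_COUNTS`, `words` and `chars` are illustrative names.

```python
import string
from collections import Counter

# Value counts copied from the FAMILY "Common Values" table.
FAMILY_COUNTS = {
    "Sample from an individual with type 1 neurofibromatosis": 113,
    "No family history of gastrointestinal stromal tumour or paraganglioma": 9,
    "No family history of cancer": 3,
    "Colon cancer (father)": 1,
    "Brother of sample 1117717": 1,
    "Son of sample 1029463": 1,
    "Individual from a family with no known family history of "
    "gastrointestinal stromal tumour, paraganglioma or pheochromocytoma": 1,
    "Brother of sample 1117718": 1,
    "Father of sample 1029464": 1,
    "Lung cancer (FDR)": 1,
}

words = Counter()
chars = Counter()
for value, count in FAMILY_COUNTS.items():
    # Assumed tokenization: lower-case, split on whitespace, trim punctuation.
    for token in value.lower().split():
        words[token.strip(string.punctuation)] += count
    # Characters are counted as written, including spaces and capitals.
    for ch in value:
        chars[ch] += count

total_words = sum(words.values())
total_chars = sum(chars.values())

print(words.most_common(3))
print(f"space: {chars[' ']} of {total_chars} characters "
      f"({100 * chars[' '] / total_chars:.1f}%)")
```

With these rules the totals come to 1038 words and 7173 characters, and the blank entry at the top of the character table corresponds to the space character.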
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 7173 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 7173 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 7173 | 100.0% |
INDIVIDUAL_REMARK
Categorical
High correlation  Missing 
| Distinct | 36 |
|---|---|
| Distinct (%) | 15.2% |
| Missing | 19135 |
| Missing (%) | 98.8% |
| Memory size | 1.3 MiB |
| Value | Count |
|---|---|
| Age=Adult 19-83 years | 58 |
| Age=Adult | 39 |
| Individual has Carney triad | 34 |
| Individual has NF1 | 21 |
| No history of FAP | 13 |
| Other values (31) | 72 |
Length
| Max length | 91 |
|---|---|
| Median length | 75 |
| Mean length | 22.729958 |
| Min length | 6 |
Unique
| Unique | 21 |
|---|---|
| Unique (%) | 8.9% |
Sample
| 1st row | Individual was under 21 |
|---|---|
| 2nd row | Age=Adult 19-83 years |
| 3rd row | Age=Adult 19-83 years |
| 4th row | Age=Adult 19-83 years |
| 5th row | Age=Middle-aged |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Age=Adult 19-83 years | 58 | 0.3% |
| Age=Adult | 39 | 0.2% |
| Individual has Carney triad | 34 | 0.2% |
| Individual has NF1 | 21 | 0.1% |
| No history of FAP | 13 | 0.1% |
| Individual was under 21 | 13 | 0.1% |
| No family history of NF1 | 10 | 0.1% |
| Sporadic NF1 | 7 | < 0.1% |
| Individual has neurofibromatosis type 1 | 6 | < 0.1% |
| Remark | 4 | < 0.1% |
| Other values (26) | 32 | 0.2% |
| (Missing) | 19135 | 98.8% |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| age=adult | 97 | 12.1% |
| individual | 92 | 11.5% |
| has | 77 | 9.6% |
| years | 59 | 7.4% |
| 19-83 | 58 | 7.2% |
| carney | 40 | 5.0% |
| triad | 40 | 5.0% |
| nf1 | 39 | 4.9% |
| of | 26 | 3.2% |
| no | 24 | 3.0% |
| Other values (83) | 249 | 31.1% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 564 | 10.5% |
| a | 440 | 8.2% |
| d | 389 | 7.2% |
| i | 352 | 6.5% |
| e | 319 | 5.9% |
| r | 269 | 5.0% |
| l | 249 | 4.6% |
| t | 231 | 4.3% |
| s | 224 | 4.2% |
| n | 224 | 4.2% |
| Other values (42) | 2126 | 39.5% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 5387 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 5387 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 5387 | 100.0% |
Interactions
Correlations
| | AGE | CYTOGENETICS | ENVIRONMENTAL_VARIABLES | ETHNICITY | FAMILY | GENDER | GERMLINE_MUTATION | GRADE | INDIVIDUAL_ID | INDIVIDUAL_REMARK | METASTATIC_SITE | MUTATION_ALLELE_SPECIFICATION | NORMAL_TISSUE_TESTED | RNASEQ_SCREEN | SAMPLE_TYPE | STAGE | TARGETED_SCREEN | THERAPY | TUMOUR_ID | TUMOUR_REMARK | TUMOUR_SOURCE | WHOLE_EXOME_SCREEN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AGE | 1.000 | 0.155 | 0.000 | 0.225 | 0.311 | 0.104 | 0.000 | 0.000 | -0.024 | 0.447 | 0.106 | 0.158 | 0.144 | 0.000 | 0.105 | 0.175 | 0.034 | 0.095 | -0.034 | 0.484 | 0.115 | 0.051 |
| CYTOGENETICS | 0.155 | 1.000 | 0.000 | 0.000 | 0.000 | 0.704 | 0.000 | 0.000 | 0.987 | 0.000 | 0.519 | 0.000 | 0.000 | 1.000 | 0.689 | 0.000 | 0.445 | 0.000 | 0.987 | 0.000 | 0.345 | 0.445 |
| ENVIRONMENTAL_VARIABLES | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.171 | 0.000 | 0.000 | 0.721 | 0.000 | 0.205 | 0.000 | 0.057 | 1.000 | 0.185 | NaN | 1.000 | 0.000 | 0.721 | 0.000 | 0.701 | 1.000 |
| ETHNICITY | 0.225 | 0.000 | 0.000 | 1.000 | NaN | 0.448 | 0.000 | 0.000 | 0.716 | 0.347 | 0.667 | 1.000 | 0.962 | 1.000 | 0.373 | 0.000 | 1.000 | 0.963 | 0.692 | 0.000 | 0.581 | 1.000 |
| FAMILY | 0.311 | 0.000 | 0.000 | NaN | 1.000 | 0.192 | 1.000 | 1.000 | 0.795 | 0.000 | 1.000 | 0.000 | 0.000 | 1.000 | 0.670 | 0.000 | 1.000 | 0.000 | 0.788 | 1.000 | 0.198 | 1.000 |
| GENDER | 0.104 | 0.704 | 0.171 | 0.448 | 0.192 | 1.000 | 0.000 | 0.000 | 0.217 | 0.763 | 0.213 | 0.480 | 0.014 | 0.038 | 0.151 | 0.602 | 0.119 | 0.496 | 0.217 | 0.808 | 0.285 | 0.113 |
| GERMLINE_MUTATION | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 1.000 | 0.000 | 0.732 | NaN | NaN | 0.000 | 0.886 | 1.000 | 0.482 | 0.000 | 1.000 | 0.000 | 0.732 | 0.000 | 0.519 | 1.000 |
| GRADE | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 1.000 | 0.692 | 1.000 | 0.000 | 0.000 | 1.000 | 1.000 | 0.702 | 1.000 | 1.000 | 0.000 | 0.692 | 0.000 | 0.156 | 1.000 |
| INDIVIDUAL_ID | -0.024 | 0.987 | 0.721 | 0.716 | 0.795 | 0.217 | 0.732 | 0.692 | 1.000 | 0.850 | 0.399 | 0.837 | 0.638 | 0.037 | 0.283 | 0.564 | 0.147 | 0.582 | 0.996 | 0.976 | 0.333 | 0.157 |
| INDIVIDUAL_REMARK | 0.447 | 0.000 | 0.000 | 0.347 | 0.000 | 0.763 | NaN | 1.000 | 0.850 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 0.898 | NaN | 1.000 | 0.991 | 0.849 | 0.456 | 0.779 | 1.000 |
| METASTATIC_SITE | 0.106 | 0.519 | 0.205 | 0.667 | 1.000 | 0.213 | NaN | 0.000 | 0.399 | 0.000 | 1.000 | 1.000 | 0.219 | 1.000 | 0.295 | 0.000 | 0.357 | 0.488 | 0.388 | 1.000 | 0.525 | 0.357 |
| MUTATION_ALLELE_SPECIFICATION | 0.158 | 0.000 | 0.000 | 1.000 | 0.000 | 0.480 | 0.000 | 0.000 | 0.837 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 0.912 | 0.000 | 1.000 | 0.000 | 0.975 | 0.000 | 0.853 | 1.000 |
| NORMAL_TISSUE_TESTED | 0.144 | 0.000 | 0.057 | 0.962 | 0.000 | 0.014 | 0.886 | 1.000 | 0.638 | 1.000 | 0.219 | 0.000 | 1.000 | 0.027 | 0.241 | 0.816 | 0.133 | 1.000 | 0.638 | 0.000 | 0.464 | 0.125 |
| RNASEQ_SCREEN | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.038 | 1.000 | 1.000 | 0.037 | 1.000 | 1.000 | 1.000 | 0.027 | 1.000 | 0.060 | 1.000 | 0.309 | 1.000 | 0.037 | 1.000 | 0.000 | 0.000 |
| SAMPLE_TYPE | 0.105 | 0.689 | 0.185 | 0.373 | 0.670 | 0.151 | 0.482 | 0.702 | 0.283 | 0.898 | 0.295 | 0.912 | 0.241 | 0.060 | 1.000 | 0.149 | 0.042 | 0.257 | 0.283 | 0.917 | 0.097 | 0.034 |
| STAGE | 0.175 | 0.000 | NaN | 0.000 | 0.000 | 0.602 | 0.000 | 1.000 | 0.564 | NaN | 0.000 | 0.000 | 0.816 | 1.000 | 0.149 | 1.000 | 1.000 | 0.000 | 0.564 | 0.000 | 0.529 | 1.000 |
| TARGETED_SCREEN | 0.034 | 0.445 | 1.000 | 1.000 | 1.000 | 0.119 | 1.000 | 1.000 | 0.147 | 1.000 | 0.357 | 1.000 | 0.133 | 0.309 | 0.042 | 1.000 | 1.000 | 1.000 | 0.137 | 1.000 | 0.101 | 0.934 |
| THERAPY | 0.095 | 0.000 | 0.000 | 0.963 | 0.000 | 0.496 | 0.000 | 0.000 | 0.582 | 0.991 | 0.488 | 0.000 | 1.000 | 1.000 | 0.257 | 0.000 | 1.000 | 1.000 | 0.582 | 0.922 | 0.609 | 1.000 |
| TUMOUR_ID | -0.034 | 0.987 | 0.721 | 0.692 | 0.788 | 0.217 | 0.732 | 0.692 | 0.996 | 0.849 | 0.388 | 0.975 | 0.638 | 0.037 | 0.283 | 0.564 | 0.137 | 0.582 | 1.000 | 0.974 | 0.330 | 0.148 |
| TUMOUR_REMARK | 0.484 | 0.000 | 0.000 | 0.000 | 1.000 | 0.808 | 0.000 | 0.000 | 0.976 | 0.456 | 1.000 | 0.000 | 0.000 | 1.000 | 0.917 | 0.000 | 1.000 | 0.922 | 0.974 | 1.000 | 0.977 | 1.000 |
| TUMOUR_SOURCE | 0.115 | 0.345 | 0.701 | 0.581 | 0.198 | 0.285 | 0.519 | 0.156 | 0.333 | 0.779 | 0.525 | 0.853 | 0.464 | 0.000 | 0.097 | 0.529 | 0.101 | 0.609 | 0.330 | 0.977 | 1.000 | 0.108 |
| WHOLE_EXOME_SCREEN | 0.051 | 0.445 | 1.000 | 1.000 | 1.000 | 0.113 | 1.000 | 1.000 | 0.157 | 1.000 | 0.357 | 1.000 | 0.125 | 0.000 | 0.034 | 1.000 | 0.934 | 1.000 | 0.148 | 1.000 | 0.108 | 1.000 |
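The "High correlation" alerts at the top of the report can be cross-checked against a row of this matrix. The sketch below assumes a cutoff of 0.5 on the absolute coefficient; that value is inferred from the alerts in this report rather than confirmed from the profiler's configuration. `highly_correlated` is a hypothetical helper, and the GENDER row is copied from the table above.

```python
import math

FIELDS = [
    "AGE", "CYTOGENETICS", "ENVIRONMENTAL_VARIABLES", "ETHNICITY", "FAMILY",
    "GENDER", "GERMLINE_MUTATION", "GRADE", "INDIVIDUAL_ID",
    "INDIVIDUAL_REMARK", "METASTATIC_SITE", "MUTATION_ALLELE_SPECIFICATION",
    "NORMAL_TISSUE_TESTED", "RNASEQ_SCREEN", "SAMPLE_TYPE", "STAGE",
    "TARGETED_SCREEN", "THERAPY", "TUMOUR_ID", "TUMOUR_REMARK",
    "TUMOUR_SOURCE", "WHOLE_EXOME_SCREEN",
]

# GENDER row of the correlation matrix, in the same column order as FIELDS.
GENDER_ROW = [
    0.104, 0.704, 0.171, 0.448, 0.192, 1.000, 0.000, 0.000, 0.217, 0.763,
    0.213, 0.480, 0.014, 0.038, 0.151, 0.602, 0.119, 0.496, 0.217, 0.808,
    0.285, 0.113,
]


def highly_correlated(field, row, threshold=0.5):
    """Return the other fields whose |coefficient| reaches the threshold.

    The 0.5 default is an assumption inferred from this report's alerts.
    NaN entries (pairs the profiler could not score) are skipped.
    """
    return [
        name
        for name, value in zip(FIELDS, row)
        if name != field and not math.isnan(value) and abs(value) >= threshold
    ]


partners = highly_correlated("GENDER", GENDER_ROW)
print(partners)
```

For GENDER this selects four fields, which is consistent with the alert "GENDER is highly overall correlated with CYTOGENETICS and 3 other fields".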
Missing values
Sample
| | COSMIC_SAMPLE_ID | SAMPLE_NAME | COSMIC_PHENOTYPE_ID | TUMOUR_ID | SAMPLE_TYPE | INDIVIDUAL_ID | WHOLE_GENOME_SCREEN | WHOLE_EXOME_SCREEN | TARGETED_SCREEN | RNASEQ_SCREEN | REARRANGEMENT_SCREEN | TUMOUR_SOURCE | NORMAL_TISSUE_TESTED | GENDER | AGE | THERAPY_RELATIONSHIP | SAMPLE_DIFFERENTIATOR | MUTATION_ALLELE_SPECIFICATION | MSI | AVERAGE_PLOIDY | SAMPLE_REMARK | DRUG_RESPONSE | GRADE | AGE_AT_TUMOUR_RECURRENCE | STAGE | CYTOGENETICS | METASTATIC_SITE | TUMOUR_REMARK | ETHNICITY | ENVIRONMENTAL_VARIABLES | GERMLINE_MUTATION | THERAPY | FAMILY | INDIVIDUAL_REMARK |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 11 | COSS1496935 | 1496935 | COSO31355381 | 1419814 | surgery-fixed | 1368880 | n | n | y | n | n | primary | NaN | u | NaN | NaN | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 64 | COSS1036427 | 1036427 | COSO28815381 | 953184 | surgery-fixed | 930275 | n | n | y | n | n | NS | NaN | u | NaN | NaN | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Tumour preselected as being negative for mutations in KIT e9, e11, e13, e17 and PDGFRA e12 and e18. (Tumour preselected as being negative for mutations in KIT e9, e11, e13, e17 and PDGFRA e12 and e18.) | NaN | NaN | NaN | NaN | NaN | NaN |
| 120 | COSS1496985 | 1496985 | COSO31355381 | 1419864 | surgery-fixed | 1368930 | n | n | y | n | n | primary | NaN | u | NaN | NaN | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 306 | COSS2468028 | 2468028 | COSO36075381 | 2330859 | surgery-fixed | 2180560 | n | n | y | n | n | primary | NaN | m | 72.0 | NaN | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 387 | COSS820950 | E17588 | COSO28775381 | 740710 | surgery-fixed | 723216 | n | n | y | n | n | primary | n | u | NaN | NaN | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 434 | COSS1731941 | 1731941 | COSO31355381 | 1637907 | surgery-fixed | 1573598 | n | n | y | n | n | NS | NaN | u | NaN | NaN | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | No prior imatinib therapy | NaN | NaN |
| 528 | COSS1384396 | RYU07-20 | COSO36075381 | 1294437 | surgery - NOS | 1261383 | n | n | y | n | n | primary | NaN | u | NaN | NaN | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 566 | COSS1544835 | 1544835 | COSO36605381 | 1466995 | surgery-fixed | 1413601 | n | n | y | n | n | NS | NaN | u | NaN | NaN | NaN | NaN | Unknown | NaN | NaN | Imatinib clinical primary non response (local progression) | NaN | NaN | NaN | NaN | NaN | NaN | Taiwanese | NaN | NaN | NaN | NaN | NaN |
| 719 | COSS2611118 | 2611118 | COSO31355381 | 2471936 | surgery-fixed | 2321371 | n | n | y | n | n | NS | NaN | u | NaN | NaN | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 892 | COSS1284235 | 1284235 | COSO28815763 | 1195530 | surgery fresh/frozen | 1167235 | n | n | y | n | n | NS | NaN | f | 68.0 | NaN | NaN | NaN | Unknown | NaN | NaN | NaN | Some Grade data are given in publication | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| | COSMIC_SAMPLE_ID | SAMPLE_NAME | COSMIC_PHENOTYPE_ID | TUMOUR_ID | SAMPLE_TYPE | INDIVIDUAL_ID | WHOLE_GENOME_SCREEN | WHOLE_EXOME_SCREEN | TARGETED_SCREEN | RNASEQ_SCREEN | REARRANGEMENT_SCREEN | TUMOUR_SOURCE | NORMAL_TISSUE_TESTED | GENDER | AGE | THERAPY_RELATIONSHIP | SAMPLE_DIFFERENTIATOR | MUTATION_ALLELE_SPECIFICATION | MSI | AVERAGE_PLOIDY | SAMPLE_REMARK | DRUG_RESPONSE | GRADE | AGE_AT_TUMOUR_RECURRENCE | STAGE | CYTOGENETICS | METASTATIC_SITE | TUMOUR_REMARK | ETHNICITY | ENVIRONMENTAL_VARIABLES | GERMLINE_MUTATION | THERAPY | FAMILY | INDIVIDUAL_REMARK |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1557588 | COSS2432123 | 2432123 | COSO31355381 | 2295026 | surgery-fixed | 2146615 | n | n | y | n | n | NS | NaN | u | NaN | Sample analysed before imatinib therapy | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1557594 | COSS2482076 | 2482076 | COSO31355381 | 2344806 | surgery fresh/frozen | 2194295 | n | n | y | n | n | NS | NaN | u | NaN | NaN | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1557680 | COSS1929167 | 1929167 | COSO36605381 | 1816457 | surgery - NOS | 1729772 | n | n | y | n | n | NS | NaN | u | NaN | NaN | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1557745 | COSS2503154 | 2503154 | COSO28775381 | 2365504 | surgery - NOS | 2216705 | n | n | y | n | n | primary | NaN | u | NaN | Sample analysed before imatinib therapy | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | No prior imatinib therapy | NaN | NaN |
| 1557748 | COSS2431593 | 2431593 | COSO31355381 | 2294496 | surgery-fixed | 2146085 | n | n | y | n | n | NS | NaN | u | NaN | Sample analysed before imatinib therapy | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1557901 | COSS2906437 | 2906437 | COSO36605381 | 2760689 | surgery-fixed | 2596118 | n | n | y | n | n | primary | NaN | u | NaN | Sample taken before imatinib therapy | NaN | NaN | Unknown | NaN | NaN | Imatinib clinical response - not further specified | NaN | NaN | NaN | NaN | NaN | NaN | Chinese | NaN | NaN | NaN | NaN | NaN |
| 1558040 | COSS2402260 | 2402260 | COSO36605381 | 2265118 | surgery-fixed | 2115894 | n | n | y | n | n | NS | NaN | u | NaN | NaN | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1558051 | COSS1731926 | 1731926 | COSO31355381 | 1637892 | surgery-fixed | 1573583 | n | n | y | n | n | NS | NaN | u | NaN | NaN | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | No prior imatinib therapy | NaN | NaN |
| 1558255 | COSS1479674 | 1479674 | COSO28815385 | 1403334 | surgery-fixed | 1353085 | n | n | y | n | n | primary | NaN | u | NaN | NaN | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1558260 | COSS1182421 | 1182421 | COSO28775381 | 1094392 | surgery-fixed | 1067674 | n | n | y | n | n | NS | NaN | m | 72.0 | NaN | NaN | NaN | Unknown | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Urinary bladder carcinoma since 32 months |